fix: Skip malformed SSE events instead of crashing (AI_JSONParseError)#170
Adding CLAUDE.md with task information for AI processing. This file will be removed when the task is complete. Issue: #169
Add StreamParseError as a retryable error type to handle malformed JSON in SSE streams from AI providers. This fixes premature retry failures when providers return corrupted streaming responses (e.g., concatenated SSE chunks, invalid JSON).

Changes:
- Add StreamParseError type with isRetryable: true
- Detect AI_JSONParseError, JSON parsing failures, and malformed JSON errors
- Retry stream parse errors with exponential backoff (1s, 2s, 4s up to 3 retries)
- Add streamParseErrorDelay() function for consistent retry timing
- Add comprehensive test coverage for StreamParseError detection

This ensures the agent's 7-day retry window works for all transient errors, not just HTTP 429 rate limits and socket errors.

Fixes #169
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
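For reference, the backoff schedule named in the commit message above (1s, 2s, 4s, up to 3 retries) could look roughly like the sketch below. The function name `backoffDelayMs`, its 1-based `attempt` parameter, and the `undefined` return once the budget is spent are assumptions for illustration; the actual `streamParseErrorDelay()` signature is not shown in this PR, and the retry approach is removed again later in this conversation in favour of skip-and-continue.

```typescript
// Hypothetical sketch only: not the PR's streamParseErrorDelay() implementation.
const BASE_DELAY_MS = 1_000; // first retry waits 1s
const MAX_RETRIES = 3; // 1s, 2s, 4s, then give up

// `attempt` is 1-based. Returns the delay in milliseconds,
// or undefined once the retry budget is exhausted.
function backoffDelayMs(attempt: number): number | undefined {
  if (!Number.isInteger(attempt) || attempt < 1 || attempt > MAX_RETRIES) {
    return undefined;
  }
  // Double the base delay for every attempt after the first.
  return BASE_DELAY_MS * 2 ** (attempt - 1);
}

const schedule = [1, 2, 3, 4].map((attempt) => backoffDelayMs(attempt));
console.log(schedule);
```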
This reverts commit 8c61702.
🤖 Solution Draft Log
This log file contains the complete execution trace of the AI solution draft process. 💰 Cost estimation:
Now working session is ended, feel free to review and add any feedback on the solution draft.
🔄 Auto-restart 1/3
Detected uncommitted changes from previous run. Starting new session to review and commit them. Uncommitted files: Auto-restart will stop after changes are committed or after 2 more iterations. Please wait until working session will end and give your feedback.
Add the complete execution log from the solve session that triggered issue #169. This log shows the AI_JSONParseError from malformed SSE stream data and the premature session termination. The log is named original-log.txt (instead of .log) to avoid gitignore rules while maintaining consistency with other case studies.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
🔄 Auto-restart 1/3 Log
This log file contains the complete execution trace of the AI solution draft process. 💰 Cost estimation:
Now working session is ended, feel free to review and add any feedback on the solution draft.
I think your analysis is wrong: we don't use OpenRouter, but OpenCode Zen. Also, there is no retry-able error here. That is Double check the best experience from fresh version of https://github.com/anomalyco/opencode, https://github.com/google-gemini/gemini-cli, https://github.com/QwenLM/qwen-code, https://github.com/openai/codex. You may want to use subagents to save your context on exploration of these after download to temporary folders. I need you to redo the analysis and file the issue to all responsible parties if not yet filed, and propose solutions or at least workarounds. Please download all logs and data related to the issue to this repository, make sure we compile that data to If issue related to any other repository/project, where we can report issues on GitHub, please do so. Each issue must contain reproducible examples, workarounds and suggestions for fixing the issue in code.
🤖 AI Work Session Started
Starting automated work session at 2026-02-14T10:47:15.681Z. The PR has been converted to draft mode while work is in progress. This comment marks the beginning of an AI work session. Please wait for the session to finish, and provide your feedback.
Corrected analysis: provider is OpenCode Zen via Kilo AI Gateway (not OpenRouter as previously stated). Added comprehensive comparison of 4 CLI agents (Codex, Gemini, Qwen, OpenCode) showing how each handles SSE stream parse errors.

Key findings:
- OpenAI Codex skips bad SSE events and continues (best practice)
- Vercel AI SDK AI_JSONParseError has no isRetryable property
- OpenCode upstream has the same gap (filed anomalyco/opencode#13579)
- Kilo Gateway corrupts SSE chunks when proxying Kimi K2.5

Filed upstream issues:
- vercel/ai#12595: AI_JSONParseError should support retry
- Kilo-Org/kilocode#5875: SSE stream corruption with Kimi K2.5
- anomalyco/opencode#13579: AI_JSONParseError not retried

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
🤖 Solution Draft Log
This log file contains the complete execution trace of the AI solution draft process. 💰 Cost estimation:
Now working session is ended, feel free to review and add any feedback on the solution draft.
That is absolutely wrong, this error is absolutely not retry-able, we need to find root cause and fix it. Explain how we got involved The model was set too I don't understand what is going on. Please make the full picture crystal clear.
🤖 AI Work Session Started
Starting automated work session at 2026-02-14T11:23:17.557Z. The PR has been converted to draft mode while work is in progress. This comment marks the beginning of an AI work session. Please wait for the session to finish, and provide your feedback.
…#169

When AI gateways (e.g. OpenCode Zen) corrupt SSE stream chunks when proxying provider responses (e.g. Kimi K2.5), the Vercel AI SDK emits an error event with AI_JSONParseError but continues the stream. Previously, the processor threw on all error events, terminating the session after a single corrupted chunk. Now, following the OpenAI Codex approach (skip-and-continue), the processor detects JSONParseError in stream error events, logs a warning, and continues processing subsequent valid chunks.

Changes:
- Skip JSONParseError in processor.ts error handler (Codex approach)
- Remove StreamParseError type from message-v2.ts (not retryable)
- Remove stream parse error retry infrastructure from retry.ts
- Update tests to verify new behavior

Fixes #169

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Corrected the investigation:
- Provider chain is OpenCode Zen (opencode.ai/zen/v1), NOT Kilo Gateway
- Root cause is in processor.ts (throw on error event), NOT in error classification
- Fix approach is skip-and-continue (Codex pattern), NOT retry
- Updated CLI agent comparison with accurate findings
- Updated upstream issue drafts with correct provider information

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
🤖 Solution Draft Log
This log file contains the complete execution trace of the AI solution draft process. 💰 Cost estimation:
Now working session is ended, feel free to review and add any feedback on the solution draft.
Summary
- Skip AI_JSONParseError in the processor.ts stream error handler instead of throwing (Codex approach)
- Remove the StreamParseError type and retry infrastructure (the error is not retryable, but skippable)

Fixes #169
Root Cause
The infrastructure chain is:
Agent (Bun) → OpenCode Zen (opencode.ai/zen/v1) → Moonshot Kimi K2.5 API

Evidence from logs:
- NOT Kilo AI Gateway: resolveShortModelName() in provider.ts:1452 prefers opencode for shared models.

The OpenCode Zen gateway returned corrupted SSE data, with two chunks concatenated:
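The logged payload itself is not reproduced in this description. As a rough illustration with made-up payloads (the field names `id` and `delta` and the helper `tryParse` are hypothetical, not taken from the logs or the SDK), the sketch below shows why two JSON documents fused into one SSE data payload cannot be parsed, while each chunk on its own parses fine:

```typescript
// Made-up chunk payloads; the real corrupted chunk from the logs is not shown here.
const chunkA = JSON.stringify({ id: "a", delta: "Hel" });
const chunkB = JSON.stringify({ id: "b", delta: "lo" });

// A well-formed stream carries one JSON document per data payload.
// The corruption described above fuses two documents into one payload.
const fused = chunkA + chunkB;

type ParseResult =
  | { ok: true; value: unknown }
  | { ok: false; error: string };

// Parse without throwing, so the caller can decide to skip a bad event.
function tryParse(payload: string): ParseResult {
  try {
    return { ok: true, value: JSON.parse(payload) };
  } catch (e) {
    return { ok: false, error: e instanceof Error ? e.message : String(e) };
  }
}

const single = tryParse(chunkA); // parses: one complete JSON document
const corrupted = tryParse(fused); // fails: trailing data after the first document
console.log(single.ok, corrupted.ok);
```

Returning a result object instead of throwing mirrors the general idea of a safe parse step: the failure is reported as data, and the consumer chooses whether it is fatal.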
The Vercel AI SDK handles this correctly:
- safeParseJSON() catches the parse failure
- It emits a { type: 'error', error: JSONParseError } event

But processor.ts:208 had case 'error': throw value.error; which terminated the session.

Fix: Skip-and-Continue (OpenAI Codex Approach)
This error is NOT retryable — it's skippable. The stream continues after the error event.
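A minimal sketch of the skip-and-continue idea, assuming a simplified stream of parts. `FakeJSONParseError`, `StreamPart`, `fakeStream`, and `collect` are hypothetical stand-ins so the example runs without the AI SDK; the actual change in processor.ts is not reproduced here and may differ in detail.

```typescript
// Stand-in for the SDK's JSONParseError so this sketch has no dependencies.
class FakeJSONParseError extends Error {
  static isInstance(e: unknown): e is FakeJSONParseError {
    return e instanceof FakeJSONParseError;
  }
}

// Simplified part shape; the real stream has many more part types.
type StreamPart =
  | { type: "text-delta"; text: string }
  | { type: "error"; error: unknown };

// Simulates a stream where one corrupted chunk sits between two valid ones.
async function* fakeStream(): AsyncGenerator<StreamPart> {
  yield { type: "text-delta", text: "Hel" };
  yield { type: "error", error: new FakeJSONParseError("malformed SSE chunk") };
  yield { type: "text-delta", text: "lo" };
}

async function collect(
  stream: AsyncIterable<StreamPart>,
): Promise<{ text: string; skipped: number }> {
  let text = "";
  let skipped = 0;
  for await (const value of stream) {
    switch (value.type) {
      case "text-delta":
        text += value.text;
        break;
      case "error":
        if (FakeJSONParseError.isInstance(value.error)) {
          // Parse errors affect one event only: warn and keep reading.
          skipped++;
          console.warn("Skipping malformed SSE event");
          continue; // moves on to the next stream part
        }
        // Any other error is still fatal.
        throw value.error;
    }
  }
  return { text, skipped };
}

collect(fakeStream()).then((result) => console.log(result));
```

The important design point is that only parse errors are skipped; every other error event still propagates, so genuine failures are not silently swallowed.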
CLI Agent Comparison
- debug!("Failed to parse SSE event"); continue;
- throw e; in SDK
- case 'error': throw value.error;
- if (JSONParseError.isInstance(...)) continue;

Changes
- js/src/session/processor.ts
- js/src/session/message-v2.ts
- js/src/session/retry.ts
- js/tests/stream-parse-error.test.js
- js/tests/retry-state.test.js
- docs/case-studies/issue-169/*.md

Upstream Issues Filed
Test Plan
- fromError() (irrelevant now: we skip before it reaches there)

🤖 Generated with Claude Code